In one embodiment, this invention provides a nucleic acid encoding the 
carboxy-terminal portion of the heavy chain (HC) of botulinum neurotoxin (BoNT), 
the BoNT being selected from the group consisting of BoNT serotype A, BoNT 
serotype B, BoNT serotype C1, BoNT serotype D, BoNT serotype E, BoNT 
serotype F, and BoNT serotype G, wherein said nucleic acid is expressible in a 
recombinant organism selected from Escherichia coli and Pichia pastoris. 
Preferably, the nucleic acid comprises a nucleic acid sequence selected from SEQ ID 
No:1 (serotype A), SEQ ID No:7 (serotype B), SEQ ID No:9 (serotype C1), SEQ ID 
No:11 (serotype D), SEQ ID No:13 (serotype E), SEQ ID No:15 (serotype F), and 
SEQ ID No:17 (serotype G). In an alternative preferred embodiment, the nucleic 
acid encodes an HC amino acid sequence of BoNT selected from SEQ ID No:2 
(serotype A), SEQ ID No:8 (serotype B), SEQ ID No:10 (serotype C1), SEQ ID 
No:12 (serotype D), SEQ ID No:14 (serotype E), SEQ ID No:16 (serotype F), and 
SEQ ID No:18 (serotype G). 

In another embodiment, this invention provides a nucleic acid encoding the 
amino-terminal portion of the heavy chain (HN) of botulinum neurotoxin (BoNT), 
the BoNT being selected from the group consisting of BoNT serotype B, BoNT 
serotype C1, BoNT serotype D, BoNT serotype E, BoNT serotype F, and BoNT 
serotype G, wherein said nucleic acid is expressible in a recombinant organism 
selected from Escherichia coli and Pichia pastoris. In a preferred embodiment, the 
nucleic acid comprises a nucleic acid sequence selected from SEQ ID No:21 
(serotype B), SEQ ID No:23 (serotype C1), SEQ ID No:25 (serotype D), SEQ ID 
No:27 (serotype E), SEQ ID No:29 (serotype F), and SEQ ID No:31 (serotype G). 
Alternatively, the nucleic acid encodes an HN amino acid sequence of 
BoNT selected from SEQ ID No:22 (serotype B), SEQ ID No:24 (serotype C1), 
SEQ ID No:26 (serotype D), SEQ ID No:28 (serotype E), SEQ ID No:30 (serotype 
F), and SEQ ID No:32 (serotype G). 

Preferably, the nucleic acid of this invention is a synthetic nucleic acid. In a 
preferred embodiment, the sequence of the nucleic acid is designed by selecting at 
least a portion of the codons encoding HC from codons preferred for expression in a 
host organism, which may be selected from gram-negative bacteria, yeast, and 
mammalian cell lines; preferably, the host organism is Escherichia coli or Pichia 



weight markers. Lane 2 is the cell lysate, lane 3 is the cell extract, lane 4 is the cell 
extract after dialysis, lane 5 is the pool of rBoNTF(Hc)-positive fractions after Mono S 
column chromatography, and lane 6 is the pool of rBoNTF(Hc)-positive fractions after 
hydrophobic interaction chromatography. 

Figure 21 shows purification of rBoNTF(Hc) by sequential chromatography. 
(A) Mono S cation exchange chromatography of extract from P. pastoris. Proteins 
were eluted with an increasing NaCl gradient. Fractions positive for rBoNTF(Hc) by 
Western analysis were pooled individually and subjected to hydrophobic interaction 
chromatography (B), and proteins were eluted with a decreasing ammonium sulfate 
gradient. In both panels, protein monitored by A280nm is recorded on the left axis 
and elution conditions are recorded on the right axis, with the gradient trace laid 
over the chromatogram. 

Figure 22 shows CD spectra of purified soluble ( — ) and resolubilized (-) 
rBoNTF(Hc) at 30 µg/ml (0.62 µM) in 10 mM sodium phosphate, pH 7.0 in a 1-cm 
path length cell. Spectra were the average of four accumulations, scanned from 260 
to 200 nm at a scan rate of 10 nm/min with a 2-s response and a 1-nm bandwidth. 
The temperature was maintained at 20°C using a Peltier thermocontrol device. 

Detailed Description of the Embodiments of this Invention 

The present inventors have determined that animals, including primates, 
may be protected from the effects of botulinum neurotoxin (BoNT) by immunization 
with fragments of the botulinum neurotoxin protein expressed by recombinant 
organisms. Specifically, peptides comprising protective epitopes from the receptor 
binding domain and/or the translocation domain, found in the carboxy terminal and 
the amino terminal portions of the heavy chain of the BoNT protein, respectively, 
are expressed by recombinant organisms transfected with expression vectors 
encoding the peptides for each serotype of BoNT. Immunization with these 
recombinantly produced peptides will elicit antibodies capable of protecting animals 
against intoxication with the BoNT of the respective serotype. 

This invention provides a genetically engineered vaccine for protection 
against botulism. The vaccine comprises fragments of the A and B toxins known as 



Synthetic gene construction is a technique used to optimize a gene for expression in 
heterologous host systems. The base composition (i.e., percent A+T or percent 
G+C) as well as the specific codons in a gene sequence play a role in determining 
whether a gene from one organism will be optimally expressed in a different 
organism. Codon preferences are not arbitrary: an organism tends to use the codons 
for which corresponding tRNAs are present. If an organism does not use certain 
codons, it most likely lacks those specific tRNAs. Codons found in clostridial DNA 
(i.e., genes found in the genus of bacteria called Clostridium) are unusual both in 
terms of base composition (i.e., very high A+T base composition) and in the use of 
codons not normally found in E. coli or yeast. 
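
The base composition referred to above can be computed directly from a sequence. The following is a minimal sketch in Python, offered for illustration only; the function name `at_fraction` and the two short fragments are hypothetical and are not sequences of this disclosure. Both fragments encode the same peptide (Met-Lys-Asn-Ile-Asn-Leu-Asn-Asp), one with A+T-rich codons and one with synonymous codons of lower A+T content.

```python
def at_fraction(seq: str) -> float:
    """Return the fraction of A and T bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "AT" for base in seq) / len(seq)


# Hypothetical illustrative fragments (assumptions, not from the disclosure).
# Both encode the same eight residues using different synonymous codons.
at_rich = "ATGAAAAATATTAATTTAAATGAT"
rebalanced = "ATGAAGAACATCAACCTGAACGAC"

print(f"A+T rich:   {at_fraction(at_rich):.1%}")
print(f"rebalanced: {at_fraction(rebalanced):.1%}")
```

In this toy case, the choice of synonymous codons alone moves the A+T content from above 90% to below 60% without changing the encoded peptide.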

Table 1 is a chart depicting codon usage in Pichia pastoris. This table was 
generated by listing the codons found in a number of highly expressed genes in P. 
pastoris. The codon data was obtained by sequencing the genes and then listing 
which codons were found in the genes. 
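
A tabulation of this kind may be sketched as follows. This is an illustrative Python example under stated assumptions, not the procedure actually used to generate Table 1: the helper names `count_codons` and `usage_fractions` are hypothetical, and the two short coding sequences are toy inputs rather than P. pastoris genes.

```python
from collections import Counter


def count_codons(genes):
    """Count every in-frame codon across a list of coding sequences."""
    counts = Counter()
    for gene in genes:
        gene = gene.upper()
        if len(gene) % 3 != 0:
            raise ValueError("coding sequence length must be a multiple of 3")
        counts.update(gene[i:i + 3] for i in range(0, len(gene), 3))
    return counts


def usage_fractions(counts, synonymous):
    """Return the share of each codon within one set of synonymous codons."""
    total = sum(counts[codon] for codon in synonymous)
    return {codon: (counts[codon] / total if total else 0.0)
            for codon in synonymous}


# Toy coding sequences (assumptions for illustration only).
genes = ["ATGGGTGGTGGAGGTTAA", "ATGGGTGGATAA"]
GLYCINE = ["GGT", "GGC", "GGA", "GGG"]

counts = count_codons(genes)
gly_usage = usage_fractions(counts, GLYCINE)
for codon, share in gly_usage.items():
    print(codon, f"{share:.0%}")
```

Repeating the second step for each amino acid yields a per-residue usage chart of the same general form as a codon usage table.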

From Table 1, it is clear that amino acid residues can be encoded by 
multiple codons. When constructing synthetic genes using P. pastoris codon usage, 
it is preferred to use only those codons that are found in the naturally occurring 
genes of P. pastoris, and to keep them in approximately the same ratio found 
in the genes of the natural organism. When the clostridial gene has an overall A+T 
richness of greater than 70% and regions with A+T spikes of 95% or 
higher, both values have to be lowered for expression in expression systems like yeast. 
(Preferably, the overall A+T richness is lowered below 60% and A+T in spikes is 
also lowered to 60% or below.) It is of course necessary to balance keeping the 
same codon ratio (e.g., for glycine, GGG was not found, GGA was found 22% of the 
time, GGT was found 74% of the time, and GGC was found 3% of the time) against 
reducing the high A+T content. In the construction of the genes, it is preferred to 
keep the A+T spikes at about 55%. 
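
The two constraints discussed above, local A+T spikes and the codon ratio, can each be checked with a short routine. The Python sketch below is illustrative only and makes several assumptions not found in the text: the 30-base window used to define a "spike", the function names `at_spikes` and `allocate_codons`, and the largest-remainder rounding used to apportion codons. The glycine percentages are those given in the example above.

```python
def at_fraction(seq):
    """Return the fraction of A and T bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "AT" for base in seq) / len(seq)


def at_spikes(seq, window=30, threshold=0.60):
    """List (start, fraction) for each window whose A+T fraction exceeds threshold.

    The window length is an assumption; the text does not define one.
    """
    hits = []
    last_start = max(len(seq) - window, 0)
    for start in range(last_start + 1):
        frac = at_fraction(seq[start:start + window])
        if frac > threshold:
            hits.append((start, frac))
    return hits


def allocate_codons(n_residues, ratios):
    """Apportion n_residues among synonymous codons in proportion to ratios.

    Uses largest-remainder rounding so the counts sum to n_residues.
    """
    total = sum(ratios.values())
    exact = {codon: n_residues * r / total for codon, r in ratios.items()}
    alloc = {codon: int(x) for codon, x in exact.items()}
    leftover = n_residues - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda c: exact[c] - alloc[c], reverse=True)
    for codon in by_remainder[:leftover]:
        alloc[codon] += 1
    return alloc


# Glycine usage percentages as given in the text (GGG was not found).
GLY_RATIOS = {"GGT": 74, "GGA": 22, "GGC": 3, "GGG": 0}

# Hypothetical test sequence: an A-only stretch followed by a G+C stretch.
toy = "A" * 30 + "GC" * 15
spikes = at_spikes(toy)
gly_plan = allocate_codons(100, GLY_RATIOS)
print(len(spikes), "windows above threshold")
print(gly_plan)
```

In practice the two checks pull against each other: substituting a G+C-richer synonymous codon to flatten a spike shifts the codon ratio, so a design may need to iterate between them.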

Considering codon usage for a number of organisms including E. coli, it 
turns out that a synthetic gene using E. coli codon usage also expresses fairly well in 
P. pastoris. Similarly, a synthetic gene using P. pastoris codon usage also appears 
to express very well in E. coli. 



